A Bad Instance for k-Means++
Abstract
k-means++ is a seeding technique for the k-means method with an expected approximation ratio of O(log k), where k denotes the number of clusters. Examples are known on which the expected approximation ratio of k-means++ is Ω(log k), showing that the upper bound is asymptotically tight. However, it remained open whether k-means++ yields an O(1)-approximation with probability 1/poly(k) or even with constant probability. We settle this question and present instances on which k-means++ achieves an approximation ratio of (2/3−ε) · log k only with exponentially small probability.
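For readers unfamiliar with the seeding step that this result concerns, the following is a minimal sketch of standard k-means++ seeding (squared-distance sampling) in plain Python. It does not reproduce the bad instances constructed in the paper. The function names `sq_dist`, `kmeanspp_seed`, and `kmeans_cost` are hypothetical, chosen only for illustration, and the uniform fallback for the degenerate all-zero-weight case is an assumption of this sketch.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers by squared-distance sampling (illustrative sketch)."""
    if rng is None:
        rng = random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # First center: uniformly at random from the input points.
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0:
            # Assumption of this sketch: if every point coincides with a
            # chosen center, fall back to uniform sampling.
            nxt = rng.choice(points)
        else:
            # Sample the next center with probability proportional to d2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        d2 = [min(d, sq_dist(p, nxt)) for d, p in zip(d2, points)]
    return centers


def kmeans_cost(points, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)
```

The approximation ratio discussed in the abstract compares the value of `kmeans_cost` for the sampled centers with the cost of an optimal set of k centers. Passing a seeded `random.Random` instance as `rng` makes a run reproducible.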
Related resources
Identification of Bad Signatures in Batches
The paper addresses the problem of bad signature identification in batch verification of digital signatures. The number of generic tests necessary to identify all bad signatures in a batch instance is used to measure the efficiency of verifiers. The divide-and-conquer verifier DCV(x; n) is defined. The verifier identifies all bad signatures in a batch instance x of the length n by repeatedly splitting...
Identification of Bad Signatures in Batches
The paper addresses the problem of bad signature identification in batch verification of digital signatures. The number of generic tests necessary to identify all bad signatures in a batch instance is used to measure the efficiency of verifiers. The divide-and-conquer verifier DCVα(x,n) is defined. The verifier identifies all bad signatures in a batch instance x of the length n by repeatedly s...
The global Minmax k-means algorithm
The global k-means algorithm is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure from suitable initial positions, and employs k-means to minimize the sum of the intra-cluster variances. However, the global k-means algorithm sometimes results in singleton clusters and the initial positions are sometimes bad, afte...
A bad 2-dimensional instance for k-means++
The k-means++ seeding algorithm is one of the most popular algorithms for finding the initial k centers when using the k-means heuristic. The algorithm is a simple sampling procedure and can be described as follows: Pick the first center randomly from among the given points. For i > 1, pick a point to be the i-th center with probability proportional to the square of the Euclidean dist...
A sharp threshold for a random constraint satisfaction problem
We consider random instances I of a constraint satisfaction problem generalizing k-SAT: given n boolean variables, m ordered k-tuples of literals, and q “bad” clause assignments, find an assignment which does not set any of the k-tuples to a bad clause assignment. We consider the case where k = Ω(log n), and generate instance I by including every k-tuple of literals independently with probabili...
Journal:
- Theor. Comput. Sci.
Volume 505 Issue
Pages -
Publication date 2011